Posts with tag Digital Humanities


Apr 19 2022

It's not very hard to get individual texts in digital form. But working with grad students in the humanities looking for large sets of texts to do analysis across, I find that larger corpora are so hodgepodge as to be almost completely unusable. For humanists and ordinary people to work with large textual collections, they need to be distributed in ways that are actually accessible, not just open access.

Mar 28 2022

I've never done the "Day of DH" tradition where people explain what, exactly, it means to have a job in digital humanities. But today looks to be a pretty DH-full day, so I think, in these last days of Twitter, I'll give it a shot. (thread)

We'll start it at the beginning: 1:30 or so AM, finally sent out an e-mail I'd been procrastinating on to the college grants administrator for a public humanities project about immigrant histories I'm running with @ellennoonan and Sibylle Fischer.

We've had NYU funding as a Bennett-Polonsky Humanities Lab (https://nyuhumanities.org/program/asylum-h-lab-2020-2021/) to this point, but presenting to the history department last month clarified the use in making one of our primary sorts of records (A-files) more accessible to historians and family researchers.

But that will take some real institutional support, because the stuff we've obtained (legally!) from US customs and immigration in our trial run is so shockingly personal in a lot of cases that I can't really share it yet.

("Yet" is the wrong word: can't ethically share in my lifetime, probably. But there are still really important reasons to work on auditing these records especially. If you're a naturalized citizen or permanent resident and want any help getting your own A-file, let me know!)

OK, skipping to about 9:50 AM. (Late start b/c the first-grader had a school event and my wife teaches Thursday AM.) Today's first teaching, for my class https://benschmidt.org/WWD22 will be focused on 19C directories from the NYPL.

Nick Wolf and @bertspaan digitized these years ago, but there's more to do with them. A couple weeks ago @SWrightKennedy shared a preview of Columbia's great new geolocation data about 19C New York https://mappinghny.com/about/

And yesterday I finally pushed a full pipeline bringing the last two weeks of student work together for doing geo-matching and cleaning of these to the github repo. https://github.com/HumanitiesDataAnalysis/Directories . This should allow some amazing analysis of economic geography, name types, etc.

So now we've got 8.3m individual people for every year from 1850-1889 queued up and ready for a variety of analyses. I want to send the students a map to show how all their R code is paying off, but the deepscatter module is breaking: only one of the filters is working here.

I spend 40 minutes poking in the web code there to try to refactor the code to get the interface working right, but this isn't really relevant for the class right now; more something for the summer, I guess. So I give up and decide to do this DH tweeting instead.

Because of the whole "Twitter is almost over" thing, plus some lingering guilt about not blogging enough, I decide that a Day of DH post should really be a blog first, so let's finally structure some markdown for a twitter thread that can go on benschmidt.org.

It takes a surprising amount of mucking around with the svelte-kit settings to get things publishing correctly, and I have to remember my own markdown naming conventions. But after a few minutes, we've got full recursion. https://benschmidt.org/post/2022-03-28-day-of-dh/day-of-dh-22/

Whoops, or not... Time to muck with svelte-kit a little more.

Well, this is embarrassing but typical. Turns out there was a bug in the bleeding-edge svelte-kit build that broke trailing slash behavior in URLs. Because https://benschmidt.org/post/2022-03-19-better-texts/ is different from https://benschmidt.org/post/2022-03-19-better-texts. Finally fixed.
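For anyone who hasn't been bitten by this: the two URLs really are different strings, and a router or static host may treat them as different pages. Here is a minimal Python sketch of the idea, not the actual svelte-kit fix. The helper name `with_trailing_slash` is made up for illustration, and the rule it applies (add a slash unless the last path segment looks like a file) is an assumed convention, not what svelte-kit itself does.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url: str) -> str:
    """Return url with a trailing slash on its path (hypothetical helper).

    Query strings and fragments are left alone. Paths whose last segment
    contains a dot (e.g. feed.xml) are assumed to be files and are not changed.
    """
    parts = urlsplit(url)
    path = parts.path
    last_segment = path.rsplit("/", 1)[-1]
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


with_slash = "https://benschmidt.org/post/2022-03-19-better-texts/"
without_slash = "https://benschmidt.org/post/2022-03-19-better-texts"

# As plain strings the two URLs are not equal.
print(with_slash == without_slash)

# After normalizing to one convention they compare equal.
print(with_trailing_slash(with_slash) == with_trailing_slash(without_slash))
```

Whichever convention a site picks, the point is to pick one and apply it everywhere links get generated.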

Insane levels of debugging are a real pain and an occupational hazard. But to be honest, I don't know how anyone could responsibly teach this stuff without doing this sort of rebuilding and rescaling all the time. Every one of those things is kind of interesting and builds up the ability to fix others' code.

Jun 07 2021

This article in the New Yorker about the end of genre prompts me to share a theory I've had for a year or so that models at Spotify, Netflix, etc., are most likely not just removing artificial silos that old media companies imposed on us, but actively destroying genre without much pushback. I'm curious what you think.

Dec 05 2019

(This is a talk from a January 2019 panel at the annual meeting of the American Historical Association. You probably need to know, to read it, that the MLA conference was simultaneously taking place about 20 blocks north.)

Mar 19 2019

Critical Inquiry has posted an article by Nan Da offering a critique of some subset of digital humanities that she calls "Computational Literary Studies," or CLS. The premise of the article is to demonstrate the poverty of the field by showing that the new structure of CLS is easily dismantled by the master's own tools. It appears to have succeeded enough at gaining attention that it clearly does some kind of work far outsized relative to the merits of the article itself.

Jun 12 2015

I've gotten a couple e-mails this week from people asking advice about what sort of computers they should buy for digital humanities research. That makes me think there aren't enough resources online for this, so I'm posting my general advice here. (For some solid other perspectives, see here). For keyword optimization I'm calling this post "digital humanities." But, obviously, I really mean the subset that is humanities computing, what I tend to call humanities data analysis. [Edit: To be clear, ] Moreover, the guidelines here are specifically tailored for text analysis; if you are working with images, you'll have somewhat different needs (in particular, you may need a better graphics card). If you do GIS, god help you. I don't do any serious social network analysis, but I think the guidelines below should work relatively well with Gephi.

Apr 03 2015

Practically everyone in Digital Humanities has been posting increasingly epistemological reflections on Matt Jockers' Syuzhet package since Annie Swafford posted a set of critiques of its assumptions. I've been drafting and redrafting one myself. One of the major reasons I haven't posted it is that the obligatory list of links keeps growing. Suffice it to say that this here is not a broad methodological disputation, but rather a single idea crystallized after reading Scott Enderle on "sine waves of sentiment." I'll say what this all means for the epistemology of the Digital Humanities in a different post, to the extent that that's helpful.